@0marperez (Contributor) commented Oct 17, 2025

Issue #

N/A

Description of changes

  • Project setup
  • Publishing config
  • Transfer manager client
  • Business metric
  • Transfer interceptors
  • Upload file operation
    • Concurrent uploads
    • MPU part buffering
    • Code generated IO
    • Code generated type converters

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@github-actions

A new generated diff is ready to view.

  • No codegen difference in the AWS SDK

@0marperez 0marperez added the no-changelog Indicates that a changelog entry isn't required for a pull request. Use sparingly. label Oct 17, 2025
@0marperez 0marperez marked this pull request as ready for review October 17, 2025 15:14
@0marperez 0marperez requested a review from a team as a code owner October 17, 2025 15:14
@lauzadis (Member) left a comment

Nice start

@lauzadis lauzadis mentioned this pull request Oct 30, 2025

val targetNumberOfParts = contentLength / targetPartSize
return if (targetNumberOfParts > MAX_NUMBER_PARTS) {
    ceilDiv(contentLength, MAX_NUMBER_PARTS).also {
        logger.warn { "Target part size is too small to meet the $MAX_NUMBER_PARTS S3 part limit. Increasing part size to $it" }
Member:

Did we clarify with the spec author what level this should be logged at?

Contributor Author:

Let me ask

Contributor Author (@0marperez, Oct 30, 2025):

There's nothing in the spec mentioning logging a message when the configured part size isn't used btw.

Contributor Author:

The spec author uses DEBUG

Contributor:

Personally I like INFO level for this. The KDocs already indicate that the value is a target not a guarantee.

That said, I think we should make the message more descriptive. Ideally we'd say something like:

The target part size of <configured-value> bytes is too small to upload <object-name> in <max-num-parts> parts (the maximum allowed by S3). The object will be uploaded in parts of <calculated-part-size> bytes instead.
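For illustration, a minimal sketch of how the part-size fallback and the more descriptive message could fit together. The function name `resolvePartSize`, its parameters, and the `log` callback are hypothetical and not taken from the PR; the log level is deliberately left to the caller. Note that the sketch counts parts with ceiling division, which differs from the floor division in the quoted snippet.

```kotlin
// Assumption: S3 allows at most 10,000 parts per multipart upload (as stated in the KDoc under review).
const val MAX_NUMBER_PARTS = 10_000L

// Ceiling division for non-negative x and positive y.
fun ceilDiv(x: Long, y: Long): Long = (x + y - 1) / y

/**
 * Returns the part size to use for an object of [contentLength] bytes.
 * Falls back to the smallest part size that fits within MAX_NUMBER_PARTS
 * and reports the change through [log] (hypothetical callback; the level is the caller's choice).
 */
fun resolvePartSize(
    contentLength: Long,
    targetPartSize: Long,
    objectName: String,
    log: (String) -> Unit = { println(it) },
): Long {
    require(contentLength >= 0) { "contentLength must not be negative" }
    require(targetPartSize > 0) { "targetPartSize must be positive" }

    // Number of parts needed if the target size were honored.
    val targetNumberOfParts = ceilDiv(contentLength, targetPartSize)
    if (targetNumberOfParts <= MAX_NUMBER_PARTS) return targetPartSize

    val adjusted = ceilDiv(contentLength, MAX_NUMBER_PARTS)
    log(
        "The target part size of $targetPartSize bytes is too small to upload $objectName in " +
            "$MAX_NUMBER_PARTS parts (the maximum allowed by S3). " +
            "The object will be uploaded in parts of $adjusted bytes instead.",
    )
    return adjusted
}
```

Passing the logger in as a function keeps the sketch independent of any particular logging API.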

Comment on lines +94 to +99
if ("s3".isBootstrappedService) {
    include(":hll:s3-transfer-manager")
    include(":hll:s3-transfer-manager-codegen")
} else {
    logger.warn(":services:s3 is not bootstrapped, skipping :hll:s3-transfer-manager and subprojects")
}
Contributor:

Comment: Excellent!

Comment on lines +95 to +96
smithy-kotlin-http-test-jvm = { module = "aws.smithy.kotlin:http-test-jvm", version.ref = "smithy-kotlin-runtime-version" }
smithy-kotlin-testing-jvm = { module = "aws.smithy.kotlin:testing-jvm", version.ref = "smithy-kotlin-runtime-version" }
Contributor:

Question: Why are we explicitly depending on JVM target packages? Generally we rely on the common KMP packages (e.g., http-test, testing) because Gradle's Kotlin plugin is supposed to handle target resolution.

Comment on lines +115 to +116

"s3-transfer-manager-codegen", // TODO: Disable publishing ?
Contributor:

Comment: 👍 Yes, this is in the right place. We don't want to publish this since we have no use case for it right now. We can scratch the TODO.

Member:

This doesn't disable publication, right? Just API validation and docgen.

Comment on lines +60 to +70

java {
    sourceCompatibility = JavaVersion.VERSION_1_8
    targetCompatibility = JavaVersion.VERSION_1_8
}

tasks.withType<KotlinCompile> {
    compilerOptions {
        jvmTarget.set(JvmTarget.JVM_1_8)
    }
}
Contributor:

Question: Why are these additions necessary?

Comment on lines +40 to +71
val mpuUploadId = initiateTransfer(
    multipartUpload,
    transferContext,
    contentLength,
    uploadFileRequest,
    interceptors,
    client,
)

val uploadedParts = transferBytes(
    multipartUpload,
    contentLength,
    partSizeBytes,
    logger,
    uploadFileRequest,
    transferContext,
    mpuUploadId,
    interceptors,
    client,
    maxInMemoryParts,
    maxConcurrentPartUploads,
)

completeTransfer(
    multipartUpload,
    transferContext,
    uploadFileRequest,
    mpuUploadId,
    uploadedParts,
    interceptors,
    client,
)
Contributor:

Style: The volume of arguments passed into these functions is too great. Can any of these be grouped into objects, derived from other parameters, etc.? This level of data coupling is an indicator that we might be better served modelling a base operation type which can be implemented for each operation type (e.g., UploadFile) or for each subtype (e.g., UploadFileSingle and UploadFileMultipart), which would reduce the amount of if (multipartUpload) calls.
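One possible shape for that suggestion, sketched under loud assumptions: every type and function name below (`UploadFileRequest`, `TransferContext`, `UploadOperation`, `SingleUpload`, `MultipartUpload`, `selectOperation`) is a hypothetical stand-in rather than the PR's actual model, the operations return placeholder strings instead of calling S3, and the rule for choosing multipart is invented for the example.

```kotlin
// Hypothetical, simplified request type (not the PR's generated model).
data class UploadFileRequest(val bucket: String, val key: String, val contentLength: Long)

// Values shared by every phase of a transfer, bundled once instead of passed one by one.
data class TransferContext(val request: UploadFileRequest, val partSizeBytes: Long)

// One implementation per upload strategy replaces repeated `if (multipartUpload)` checks.
sealed interface UploadOperation {
    fun initiate(ctx: TransferContext): String? // upload ID, if the strategy has one
    fun transfer(ctx: TransferContext, uploadId: String?): List<Int> // uploaded part numbers
    fun complete(ctx: TransferContext, uploadId: String?, parts: List<Int>): String
}

object SingleUpload : UploadOperation {
    override fun initiate(ctx: TransferContext): String? = null
    override fun transfer(ctx: TransferContext, uploadId: String?): List<Int> = listOf(1)
    override fun complete(ctx: TransferContext, uploadId: String?, parts: List<Int>): String =
        "single:${ctx.request.bucket}/${ctx.request.key}"
}

object MultipartUpload : UploadOperation {
    override fun initiate(ctx: TransferContext): String? = "upload-1" // placeholder ID
    override fun transfer(ctx: TransferContext, uploadId: String?): List<Int> {
        // Ceiling division: number of parts needed at the configured part size.
        val count = ((ctx.request.contentLength + ctx.partSizeBytes - 1) / ctx.partSizeBytes).toInt()
        return (1..count).toList()
    }
    override fun complete(ctx: TransferContext, uploadId: String?, parts: List<Int>): String =
        "multipart:${ctx.request.bucket}/${ctx.request.key}:$uploadId:${parts.size}"
}

// Assumption for the sketch only: objects larger than one part use multipart.
fun selectOperation(ctx: TransferContext): UploadOperation =
    if (ctx.request.contentLength > ctx.partSizeBytes) MultipartUpload else SingleUpload

// The strategy is chosen once; each phase then takes the context rather than a long argument list.
fun upload(ctx: TransferContext): String {
    val operation = selectOperation(ctx)
    val uploadId = operation.initiate(ctx)
    val parts = operation.transfer(ctx, uploadId)
    return operation.complete(ctx, uploadId, parts)
}
```

In a real implementation the context would presumably also carry the client, interceptors, and logger, which is where most of the argument-list reduction would come from.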

"tagging",
"websiteRedirectLocation",
),
additionalLogic = "contentLength = this@toPutObjectRequest.body?.contentLength",
Contributor:

Nit: ByteStream's contentLength is nullable because it's not always possible to derive the length of a byte stream automatically. That's why S3's PutObjectRequest has a separate contentLength field users can populate with knowledge the SDK doesn't necessarily have. I think we need contentLength to be user-configurable and to prefer that value when it's set (otherwise, falling back to body?.contentLength).
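A minimal sketch of the fallback order described above. `SketchBody` and `SketchRequest` are hypothetical stand-ins for `ByteStream` and the upload request, reduced to the two fields that matter here; they are not the SDK's types.

```kotlin
// Hypothetical stand-in for a byte stream whose length may be unknown.
class SketchBody(val contentLength: Long?)

// Hypothetical request with an optional, user-supplied content length.
data class SketchRequest(val body: SketchBody?, val contentLength: Long? = null)

// Prefer the user-configured value; otherwise fall back to what the body can report.
fun resolveContentLength(request: SketchRequest): Long? =
    request.contentLength ?: request.body?.contentLength
```

The result stays nullable because neither source is guaranteed to know the length.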

Comment on lines +37 to +46
/**
 * Represents a part in a multipart upload.
 *
 * @param number The part number.
 * @param bytes The bytes of the part.
 */
internal data class Part(
    val number: Int,
    val bytes: SdkBuffer,
)
Contributor:

Nit: Could be private.

Comment on lines +137 to +139
) = produce(
    capacity = maxInMemoryParts,
) {
Contributor:

Correctness: Isn't maxInMemoryParts supposed to limit the parts for the entire S3TM? This looks like it applies to individual objects but we'll be parallelizing multi-object transfers.
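To illustrate the distinction, here is a sketch of a limit that is shared by every transfer instead of being applied per object. It swaps in a blocking `java.util.concurrent.Semaphore` and plain threads purely to stay self-contained; the PR itself uses a coroutine `produce` channel, and `PartBufferBudget` and `peakBufferedParts` are hypothetical names.

```kotlin
import java.util.concurrent.Semaphore
import java.util.concurrent.atomic.AtomicInteger
import kotlin.concurrent.thread

// Hypothetical: a single budget owned by the transfer manager and shared by all transfers.
class PartBufferBudget(maxInMemoryParts: Int) {
    private val permits = Semaphore(maxInMemoryParts)

    // Runs [block] while holding one permit, i.e. while one part is buffered in memory.
    fun <T> withPart(block: () -> T): T {
        permits.acquire()
        try {
            return block()
        } finally {
            permits.release()
        }
    }
}

// Simulates several concurrent transfers and returns the peak number of parts buffered at once.
fun peakBufferedParts(budget: PartBufferBudget, transfers: Int, partsPerTransfer: Int): Int {
    val inMemory = AtomicInteger(0)
    val peak = AtomicInteger(0)
    val workers = List(transfers * partsPerTransfer) {
        thread {
            budget.withPart {
                val now = inMemory.incrementAndGet()
                peak.updateAndGet { previous -> maxOf(previous, now) }
                Thread.sleep(5) // stand-in for reading and uploading the part
                inMemory.decrementAndGet()
            }
        }
    }
    workers.forEach { it.join() }
    return peak.get()
}
```

Because all transfers draw from the same budget, the peak never exceeds the configured limit no matter how many objects are in flight; with a per-object channel capacity, the bound would scale with the number of concurrent transfers.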

Comment on lines +94 to +108
} catch (uploadPartException: Exception) {
    try {
        client.abortMultipartUpload {
            bucket = uploadFileRequest.bucket
            expectedBucketOwner = uploadFileRequest.expectedBucketOwner
            key = uploadFileRequest.key
            requestPayer = uploadFileRequest.requestPayer
            uploadId = mpuUploadId
        }
        throw S3TransferManagerException("Multipart upload failed (ID: $mpuUploadId). One or more parts could not be uploaded", uploadPartException)
    } catch (abortException: Exception) {
        throw S3TransferManagerException("Multipart upload failed (ID: $mpuUploadId). Unable to abort multipart upload.", abortException)
            .also { it.addSuppressed(uploadPartException) }
    }
}
Contributor:

Nit: This should probably log a WARN before aborting the multipart upload.
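A sketch of what logging before the abort might look like. `TransferException`, `MultipartAborter`, `failAndAbort`, and the `warn` callback are hypothetical simplifications standing in for the PR's exception type, S3 client, and logger. The sketch also places the final throw after the `try` block: in the quoted snippet the throw sits inside the `try`, so (assuming `S3TransferManagerException` is an `Exception` subclass) the inner `catch` would intercept it as well. That placement is a choice made in this sketch, not something the review comment asks for.

```kotlin
// Hypothetical exception type standing in for the transfer manager's own exception.
class TransferException(message: String, cause: Throwable) : Exception(message, cause)

// Hypothetical minimal view of the client: only the abort call is needed here.
fun interface MultipartAborter {
    fun abort(uploadId: String)
}

// Logs a warning, attempts the abort, then always throws.
fun failAndAbort(
    uploadId: String,
    cause: Exception,
    aborter: MultipartAborter,
    warn: (String) -> Unit,
): Nothing {
    // Warn first so the failure is visible even if the abort call itself fails.
    warn("Part upload failed for multipart upload $uploadId; aborting: ${cause.message}")
    try {
        aborter.abort(uploadId)
    } catch (abortException: Exception) {
        throw TransferException("Multipart upload failed (ID: $uploadId). Unable to abort multipart upload.", abortException)
            .also { it.addSuppressed(cause) }
    }
    // Outside the try block, so the abort handler above cannot catch this exception.
    throw TransferException("Multipart upload failed (ID: $uploadId). One or more parts could not be uploaded", cause)
}
```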

Comment on lines +11 to +30
internal val uploadFileConversions = listOf(
    ConversionMapping(
        source = TypeRef(
            "aws.sdk.kotlin.services.s3.model",
            "PutObjectResponse",
        ),
        destination = TypeRef(
            "aws.sdk.kotlin.hll.s3transfermanager.model",
            "UploadFileResponse",
        ),
        setOf(
            "bucketKeyEnabled",
            "checksumCrc32",
            "checksumCrc32C",
            "checksumCrc64Nvme",
            "checksumSha1",
            "checksumSha256",
            "checksumType",
            "eTag",
            "expiration",
Member:

I don't think a data string will make this any more readable or maintainable. The internal spec already models this as a JSON list, so we can make changes (which I believe are unlikely) by inspecting the diff of that file.

Comment on lines +36 to +41
commonTest {
    dependencies {
        implementation(libs.smithy.kotlin.http.test.jvm)
        implementation(libs.smithy.kotlin.testing.jvm)
    }
}
Member:

Ian already commented on this, but JVM-only dependencies in commonTest won't work

Comment on lines +23 to +30
/**
 * Preferred part size for multipart uploads.
 * If using this size would require more than 10,000 parts (the S3 limit),
 * the smallest possible part size that results in 10,000 parts is used instead.
 *
 * Default to 8,000,000 bytes.
 */
public val partSizeBytes: Long = builder.partSizeBytes
Member:

This is my fault since I left a comment asking to change it to just partSize to simplify the name. We are logging a warning when deviating from the configured part size. It's not a strong opinion so I will let @0marperez make the decision

#1712


3 participants